Light Field Super-Resolution Via Graph-Based Regularization
Light field cameras capture the 3D information in a scene with a single
exposure. This special feature makes light field cameras very appealing for a
variety of applications: from post-capture refocus, to depth estimation and
image-based rendering. However, light field cameras suffer by design from
strong limitations in their spatial resolution, which should therefore be
augmented by computational methods. On the one hand, off-the-shelf single-frame
and multi-frame super-resolution algorithms are not ideal for light field data,
as they do not consider its particular structure. On the other hand, the few
super-resolution algorithms explicitly tailored for light field data exhibit
significant limitations, such as the need to estimate an explicit disparity map
at each view. In this work we propose a new light field super-resolution
algorithm meant to address these limitations. We adopt a multi-frame-like
super-resolution approach, where the complementary information in the different
light field views is used to augment the spatial resolution of the whole light
field. We show that coupling the multi-frame approach with a graph regularizer
that enforces the light field structure via nonlocal self-similarities makes it
possible to avoid the costly and challenging disparity estimation step for all
the views. Extensive experiments show that the new algorithm compares favorably to
the other state-of-the-art methods for light field super-resolution, both in
terms of PSNR and visual quality.
Comment: This new version includes more material. In particular, we added: a
new section on the computational complexity of the proposed algorithm,
experimental comparisons with a CNN-based super-resolution algorithm, and new
experiments on a third datase
Image registration with sparse approximations in parametric dictionaries
We examine in this paper the problem of image registration from the new
perspective where images are given by sparse approximations in parametric
dictionaries of geometric functions. We propose a registration algorithm that
looks for an estimate of the global transformation between sparse images by
examining the set of relative geometrical transformations between the
respective features. We propose a theoretical analysis of our registration
algorithm and we derive performance guarantees based on two novel important
properties of redundant dictionaries, namely the robust linear independence and
the transformation inconsistency. We propose several illustrations and insights
about the importance of these dictionary properties and show that common
properties such as coherence or restricted isometry property fail to provide
sufficient information in registration problems. We finally show with
illustrative experiments on simple visual objects and handwritten digit images
that our algorithm outperforms baseline competitor methods in terms of
transformation-invariant distance computation and classification
Online Resource Inference in Network Utility Maximization Problems
The amount of transmitted data in computer networks is expected to grow
considerably in the future, putting more and more pressure on the network
infrastructures. In order to guarantee a good service, it then becomes
fundamental to use the network resources efficiently. Network Utility
Maximization (NUM) provides a framework to optimize the rate allocation when
network resources are limited. Unfortunately, in the scenario where the amount
of available resources is not known a priori, classical NUM solving methods do
not offer a viable solution. To overcome this limitation we design an overlay
rate allocation scheme that attempts to infer the actual amount of available
network resources while coordinating the users' rate allocation. Due to the
general and complex model assumed for the congestion measurements, a passive
learning of the available resources would not lead to satisfying performance.
The coordination scheme must then perform active learning in order to speed up
the resource estimation and quickly increase the system performance. By
adopting an optimal learning formulation, we are able to balance the tradeoff
between accurate estimation and effective resource exploitation in
order to maximize the long-term quality of the service delivered to the users
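For context, the classical NUM setting that the abstract contrasts against (link capacities known in advance) can be sketched with a standard dual gradient method. This is a minimal illustration, not the online inference scheme of the paper: the logarithmic utilities, the function name `solve_num`, and the parameters `step` and `iters` are all assumptions made for the example.

```python
def solve_num(routing, capacity, weights, step=0.01, iters=20000):
    """Dual gradient ascent for the classical NUM problem (illustrative sketch):

        maximize   sum_i weights[i] * log(rate_i)
        subject to sum_i routing[l][i] * rate_i <= capacity[l]  for every link l

    routing[l][i] is 1 if user i crosses link l, else 0.
    Capacities are assumed KNOWN here, unlike in the abstract's scenario.
    """
    n_links, n_users = len(routing), len(weights)
    prices = [1.0] * n_links          # one congestion price per link
    rates = [0.0] * n_users
    for _ in range(iters):
        # Each user maximizes w*log(x) - x*(sum of prices on its path).
        for i in range(n_users):
            path_price = sum(prices[l] * routing[l][i] for l in range(n_links))
            rates[i] = weights[i] / max(path_price, 1e-9)
        # Each link raises its price when overloaded, lowers it otherwise.
        for l in range(n_links):
            load = sum(routing[l][i] * rates[i] for i in range(n_users))
            prices[l] = max(0.0, prices[l] + step * (load - capacity[l]))
    return rates, prices


# Two unit-capacity links; user 0 crosses both, users 1 and 2 cross one each.
rates, prices = solve_num(
    routing=[[1, 1, 0], [1, 0, 1]], capacity=[1.0, 1.0], weights=[1.0, 1.0, 1.0]
)
```

In this toy topology the proportionally fair allocation gives the two-link user a rate of 1/3 and each single-link user a rate of 2/3. When the capacities are unknown, the price update above cannot be computed directly, which is the gap the abstract's inference scheme targets.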
Distributed Representation of Geometrically Correlated Images with Compressed Linear Measurements
This paper addresses the problem of distributed coding of images whose
correlation is driven by the motion of objects or positioning of the vision
sensors. It concentrates on the problem where images are encoded with
compressed linear measurements. We propose a geometry-based correlation model
in order to describe the common information in pairs of images. We assume that
the constitutive components of natural images can be captured by visual
features that undergo local transformations (e.g., translation) in different
images. We first identify prominent visual features by computing a sparse
approximation of a reference image with a dictionary of geometric basis
functions. We then pose a regularized optimization problem to estimate the
corresponding features in correlated images given by quantized linear
measurements. The estimated features have to comply with the compressed
information and to represent consistent transformation between images. The
correlation model is given by the relative geometric transformations between
corresponding features. We then propose an efficient joint decoding algorithm
that estimates the compressed images such that they stay consistent with both
the quantized measurements and the correlation model. Experimental results show
that the proposed algorithm effectively estimates the correlation between
images in multi-view datasets. In addition, the proposed algorithm provides
effective decoding performance that compares advantageously to independent
coding solutions as well as state-of-the-art distributed coding schemes based
on disparity learning
Graph-based classification of multiple observation sets
We consider the problem of classification of an object given multiple
observations that possibly include different transformations. The possible
transformations of the object generally span a low-dimensional manifold in the
original signal space. We propose to take advantage of this manifold structure
for the effective classification of the object represented by the observation
set. In particular, we design a low complexity solution that is able to exploit
the properties of the data manifolds with a graph-based algorithm. Hence, we
formulate the computation of the unknown label matrix as a smoothing process on
the manifold under the constraint that all observations represent an object of
one single class. This results in a discrete optimization problem, which can be
solved by an efficient and low complexity algorithm. We demonstrate the
performance of the proposed graph-based algorithm in the classification of sets
of multiple images. Moreover, we show its high potential in video-based face
recognition, where it outperforms state-of-the-art solutions that fall short of
exploiting the manifold structure of the face image data sets.
Comment: New content adde
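The idea of label smoothness under a single-class constraint can be illustrated with a toy example. The sketch below is not the algorithm of the paper: it assumes a fully connected graph with Gaussian edge weights, and the function name `classify_observation_set` and the parameter `sigma` are hypothetical. With one-hot labels, the graph smoothness cost reduces to the total weight of edges joining differently labelled nodes, so the single shared label is chosen to minimize that cut.

```python
import math


def classify_observation_set(train_points, train_labels, observations, sigma=1.0):
    """Pick ONE class for the whole observation set by minimizing a graph cut.

    Illustrative assumption: a fully connected graph with Gaussian weights.
    Edges among observations cost nothing (they share a label), and edges
    among training points are constant, so only observation-to-training
    edges need to be compared across candidate classes.
    """
    def weight(a, b):
        dist2 = sum((x - y) ** 2 for x, y in zip(a, b))
        return math.exp(-dist2 / (2.0 * sigma ** 2))

    best_class, best_cost = None, float("inf")
    for candidate in sorted(set(train_labels)):
        # Weight of edges linking observations to training points of OTHER classes.
        cost = sum(
            weight(obs, point)
            for obs in observations
            for point, label in zip(train_points, train_labels)
            if label != candidate
        )
        if cost < best_cost:
            best_class, best_cost = candidate, cost
    return best_class


train_points = [(0.0, 0.0), (0.0, 1.0), (5.0, 5.0), (5.0, 6.0)]
train_labels = ["a", "a", "b", "b"]
# Two observations lie near class "a"; one outlier lies nearer to class "b".
observations = [(0.2, 0.1), (0.5, 0.8), (4.0, 4.0)]
label = classify_observation_set(train_points, train_labels, observations)
```

Because the decision is taken jointly, the outlying observation does not receive its own label; the set as a whole is assigned to the class that keeps the labelling smoothest on the graph.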
Graph Signal Representation with Wasserstein Barycenters
In many applications signals reside on the vertices of weighted graphs. Thus,
there is the need to learn low dimensional representations for graph signals
that will allow for data analysis and interpretation. Existing unsupervised
dimensionality reduction methods for graph signals have focused on dictionary
learning. In these works the graph is taken into consideration by imposing a
structure or a parametrization on the dictionary and the signals are
represented as linear combinations of the atoms in the dictionary. However, the
assumption that graph signals can be represented using linear combinations of
atoms is not always appropriate. In this paper we propose a novel
representation framework based on non-linear and geometry-aware combinations of
graph signals by leveraging the mathematical theory of Optimal Transport. We
represent graph signals as Wasserstein barycenters and demonstrate through our
experiments the potential of our proposed framework for low-dimensional graph
signal representation
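The contrast between linear combinations and Wasserstein barycenters is easiest to see in one dimension, where the 2-Wasserstein barycenter of two empirical distributions with the same number of samples is obtained by averaging their sorted samples. The sketch below covers only this 1-D special case, not the graph-signal framework of the paper; the function name `wasserstein_barycenter_1d` and the parameter `t` are assumptions made for the example.

```python
def wasserstein_barycenter_1d(samples_a, samples_b, t=0.5):
    """2-Wasserstein barycenter of two 1-D empirical distributions.

    Illustrative 1-D special case: with equal sample counts, the optimal
    transport plan matches sorted samples, so the barycenter with weights
    (1 - t, t) is the weighted average of the sorted samples.
    """
    if len(samples_a) != len(samples_b):
        raise ValueError("this sketch assumes equal sample counts")
    return [
        (1.0 - t) * a + t * b
        for a, b in zip(sorted(samples_a), sorted(samples_b))
    ]


# One cluster around 1 and one around 11.
midpoint = wasserstein_barycenter_1d([0.0, 1.0, 2.0], [12.0, 10.0, 11.0])
```

Pooling the two sample sets (a linear mixture of the distributions) would keep two separate clusters, whereas the barycenter is a single cluster translated halfway between them. This geometry-aware behavior is what the abstract refers to as a non-linear combination.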
Joint Reconstruction of Multi-view Compressed Images
The distributed representation of correlated multi-view images is an
important problem that arises in vision sensor networks. This paper concentrates
on the joint reconstruction problem where the distributively compressed
correlated images are jointly decoded in order to improve the reconstruction
quality of all the compressed images. We consider a scenario where the images
captured at different viewpoints are encoded independently using common coding
solutions (e.g., JPEG, H.264 intra) with a balanced rate distribution among
different cameras. A central decoder first estimates the underlying correlation
model from the independently compressed images which will be used for the joint
signal recovery. The joint reconstruction is then cast as a constrained convex
optimization problem that reconstructs total-variation (TV) smooth images that
comply with the estimated correlation model. At the same time, we add
constraints that force the reconstructed images to be consistent with their
compressed versions. We show by experiments that the proposed joint
reconstruction scheme outperforms independent reconstruction in terms of image
quality, for a given target bit rate. In addition, the decoding performance of
our proposed algorithm compares advantageously to state-of-the-art distributed
coding schemes based on disparity learning and on the DISCOVER
Graph-Based Classification of Omnidirectional Images
Omnidirectional cameras are widely used in such areas as robotics and virtual
reality as they provide a wide field of view. Their images are often processed
with classical methods, which might unfortunately lead to non-optimal solutions
as these methods are designed for planar images that have different geometrical
properties than omnidirectional ones. In this paper we study the image
classification task by taking into account the specific geometry of
omnidirectional cameras with graph-based representations. In particular, we
extend deep learning architectures to data on graphs; we propose a principled
way of graph construction such that convolutional filters respond similarly for
the same pattern on different positions of the image regardless of lens
distortions. Our experiments show that the proposed method outperforms current
techniques for the omnidirectional image classification problem